Boynton Beach
Exploring Precision and Recall to assess the quality and diversity of LLMs
Bronnec, Florian Le, Verine, Alexandre, Negrevergne, Benjamin, Chevaleyre, Yann, Allauzen, Alexandre
We introduce a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral, focusing on importing Precision and Recall metrics from image generation to text generation. This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora. By conducting a comprehensive evaluation of state-of-the-art language models, the study reveals new insights into their performance on open-ended generation tasks, which are not adequately captured by traditional benchmarks. The findings highlight a trade-off between the quality and diversity of generated samples, particularly when models are fine-tuned on instruction datasets or with human feedback. This work extends the toolkit for distribution-based NLP evaluation, offering insights into the practical capabilities and challenges that current LLMs face in generating diverse and high-quality text. We release our code and data.
- Europe > Russia (0.14)
- Asia > Russia (0.14)
- Europe > Poland > Masovia Province > Warsaw (0.04)
- Personal > Honors (1.00)
- Research Report > New Finding (0.92)
- Media (1.00)
- Health & Medicine (0.68)
- Government > Military (0.67)
- Leisure & Entertainment > Sports > Soccer (0.67)
The Future Of Work Now: AI-Assisted Skin Imaging
One of the most frequently used phrases at business events these days is "the future of work." It's increasingly clear that artificial intelligence and other new technologies will bring substantial changes in work tasks and business processes. But while these changes are predicted for the future, they're already present in many organizations for many different jobs. The situation brings to mind the William Gibson comment, "The future is already here--it's just not evenly distributed." The jobs and work processes described below are an example of this phenomenon.
- North America > United States > Florida > Palm Beach County > Boynton Beach (0.05)
- Europe > United Kingdom (0.05)
- Europe > Denmark (0.05)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
Innovative AI and Machine-Learning Technology That Detects Emotion Wins Top Award
CampaignTester was awarded Best Application of Artificial Intelligence to Optimize Creative at the 2020 Campaigns & Elections Reed Awards. CampaignTester is a cutting-edge mobile-based platform that utilizes emotion analytics and machine learning to detect a user's emotion and engagement level while watching video content. Their proprietary platform aims to deliver key audience insights for organizations to validate, revise and perfect their video content messaging. Campaigns & Elections Reed Award winners represent the "best-of-the-best" in the political campaign and advocacy industries. The 2020 Reed Awards honored winners across 16 distinct category groups, representing the different specialisms of the political campaign industry, with distinct category groups for International (non-US) work, and Grassroots Advocacy work.
- Personal > Honors (0.79)
- Press Release (0.71)
The End of the End of the World
Two years ago, a lawyer in Indiana sent me a check for seventy-eight thousand dollars. The money was from my uncle Walt, who had died six months earlier. I hadn't been expecting any money from Walt, still less counting on it. So I thought I should earmark my inheritance for something special, to honor Walt's memory. It happened that my longtime girlfriend, a native Californian, had promised to join me on a big vacation. She'd been feeling grateful to me for understanding why she had to return full time to Santa Cruz and look after her mother, who was ninety-four and losing her short-term memory. She'd said to me, impulsively, "I will take a trip with you anywhere in the world you've always wanted to go." To this I'd replied, for reasons I'm at a loss to reconstruct, "Antarctica?" Her eyes widened in a way that I should have paid closer attention to. But a promise was a promise. Hoping to make Antarctica more palatable to my temperate Californian, I decided to spend Walt's money on the most deluxe of bookings--a three-week Lindblad National Geographic expedition to Antarctica, South Georgia island, and the Falklands. I paid a deposit, and the Californian and I proceeded to joke, uneasily, when the topic arose, about the nasty cold weather and the heaving South Polar seas to which she'd consented to subject herself. I kept reassuring her that as soon as she saw a penguin she'd be happy she'd made the trip. But when it came time to pay the balance, she asked if we might postpone by a year. Her mother's situation was unstable, and she was loath to put herself so irretrievably far from home. By this point, I, too, had developed a vague aversion to the trip, an inability to recall why I'd proposed Antarctica in the first place. The idea of "seeing it before it melts" was dismal and self-cancelling: why not just wait for it to melt and cross itself off the list of travel destinations? 
I was also put off by the seventh continent's status as a trophy, too remote and expensive for the common tourist to set foot on. It was true that there were extraordinary birds to be seen, not just penguins but oddities like the snowy sheathbill and the world's southernmost-breeding songbird, the South Georgia pipit. But the number of Antarctic species is fairly small, and I'd already reconciled myself to never seeing every bird species in the world. The best reason I could think of for going to Antarctica was that it was absolutely not the kind of thing the Californian and I did; we'd learned that our ideal getaway lasts three days.
- North America > United States > Indiana (0.24)
- Asia > China (0.05)
- North America > United States > Minnesota (0.04)
- Transportation (1.00)
- Consumer Products & Services > Travel (0.66)
- Health & Medicine > Therapeutic Area (0.46)